AI Agents Fall Short in Freelance Market, Raising Questions About Job Displacement
A study by Scale AI and the Center for AI Safety found that AI agents failed to complete 97% of Upwork tasks to even a basic standard. Six AI models attempted 240 projects across writing, design, and data analysis. The top performer, Manus, succeeded in just 2.5% of cases and earned only $1,810 of the $143,991 available.
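To put those figures in proportion, the short sketch below works out what they imply. It uses only the numbers quoted above; the variable and function names are illustrative, and it assumes the 2.5% success rate applies to all 240 projects, which the article does not spell out.

```python
# Figures quoted in the article; all names here are illustrative.
TOTAL_PROJECTS = 240
TOTAL_PAYOUT_USD = 143_991
MANUS_EARNED_USD = 1_810
MANUS_SUCCESS_RATE = 0.025  # 2.5%, assumed to apply to all 240 projects


def implied_completed(total: int, rate: float) -> int:
    """Approximate number of completed projects implied by a success rate."""
    return round(total * rate)


def share_of_payout(earned: float, available: float) -> float:
    """Fraction of the available payout that was actually earned."""
    return earned / available


completed = implied_completed(TOTAL_PROJECTS, MANUS_SUCCESS_RATE)
payout_share = share_of_payout(MANUS_EARNED_USD, TOTAL_PAYOUT_USD)

print(f"Implied completed projects: {completed} of {TOTAL_PROJECTS}")
print(f"Share of payout earned: {payout_share:.2%}")
```

Under that assumption, a 2.5% success rate corresponds to about 6 of the 240 projects, while the $1,810 earned is roughly 1.26% of the available payout.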
The study highlights fundamental limitations: AI agents struggle with multi-step workflows, show little initiative, and falter when a task calls for judgment. Claude Sonnet and Grok 4 each completed only 2.1% of tasks, supporting the researchers' view that widespread job replacement remains distant.
Separate findings from the European Broadcasting Union and the BBC point to similar reliability problems. Major AI models such as ChatGPT and Gemini frequently produce inaccurate news content, with 45% of responses containing at least one significant error. Gemini fared worst, with issues in 76% of its responses.